DETAILED ACTION



Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed 
or described as set forth in section 102 of this title, if the differences between the 
subject matter sought to be patented and the prior art are such that the subject 
matter as a whole would have been obvious at the time the invention was made 
to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was 
made. 

1. Claims 2, 4-6, 13, 15-18, 23 and 25 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Baruch et al. (U.S. Patent Application Publication 2002/0091518 A1), 
hereinafter referred to as Baruch, in view of Lai et al. (U.S. Patent 6,006,183), 
hereinafter referred to as Lai, Goldhor (U.S. Patent 5,101,375), and further in view of
Scott et al. (U.S. Patent 6,101,473), hereinafter referred to as Scott.

Regarding claim 23, Baruch discloses a voice control system with multiple 
speech recognition engines. Baruch's system includes the ability to input a voice 
command to two recognition engines (abstract, ¶3, ¶10), which corresponds to
"providing an audio command to a first speech recognition engine and at least one 
second speech recognition engine"; and to recognize the command with both 
recognition engines generating recognition results (¶9), which corresponds to
"recognizing the audio command within the first speech recognition engine to generate 
at least one first recognized audio command, . . . ; and recognizing the audio command 
within the at least one second speech recognition engine, independent of recognizing
the audio command by the first speech recognition engine, to generate at least one
second recognized audio command." In addition, Baruch suggests the use of
confidence levels (¶39) but does not specifically indicate that the two recognizers
generate confidence values associated with their individual recognition results. 
However, the examiner contends that this concept was well known in the art, as taught 
by Lai. 

Lai discloses a speech recognition confidence level display that produces a score 
(confidence level) for each word that is recognized (col. 2, lines 61-63). 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch by specifically providing a confidence 
level with each recognition result, as taught by Lai, for the purpose of determining the 
degree of confidence associated with a given recognition event. 

In addition, Baruch teaches the choosing between the first recognized result of 
the first recognition engine and a second recognized result of the second engine (¶9),
which corresponds to "selecting at least one recognized audio command having a 
recognized audio command confidence value from the at least one first recognized 
audio command and the at least one second recognized audio command based on the 
at least one first confidence value and the at least one second confidence value". But 
Baruch does not specifically teach "inserting the at least one recognized audio 
command within a form". However, the examiner contends that this concept was well 
known in the art, as taught by Goldhor.

In the same field of endeavor, Goldhor teaches a report generation method
where speech recognition can be used to insert text into a report form (abstract, Fig. 2, 
col. 1, lines 26-35, lines 60-65). 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch by specifically providing the ability to 
insert words into a form using speech recognition, as taught by Goldhor, since this 
approach allows the recognition system to adjust to what is expected in the current entry
field (col. 1, lines 30-35).

Baruch further teaches using the recognizer to choose and transfer a language 
contained in a database or information from a device (PDA) over the communications 
links (¶44, penultimate sentence, ¶47, Fig. 1, 30, ¶5), but Baruch does not specifically
teach "accessing an external content server in response to the at least one recognized 
audio command to retrieve encoded information therefrom." However, the examiner 
contends that this concept was well known in the art, as taught by Scott. 

In the same field of endeavor, Scott discloses a method for accessing the 
Internet using a speech recognizer where, for example, a user can get a stock quote
over the Internet using a recognizer (Fig. 1, col. 3, lns. 5-10).

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch by specifically providing access to 
the Internet, as taught by Scott, to increase access to information. 

Regarding claim 2, Baruch in view of Lai, Goldhor, and Scott teach everything 
claimed, as applied above (see claim 1); in addition, Baruch teaches receiving
information from a PDA or email handler over a communications link (Fig. 1, 30, ¶5,
¶49), but Baruch does not specifically teach (a) "receiving the encoded information from
the content server"; and (b) "decoding the encoded information." However, the 
examiner contends that these concepts were well known in the art, as taught by Scott. 

Scott further discloses receiving information over the Internet (col. 3, lns. 5-10),
corresponding to (a), above, where the information received will necessarily require 
decoding (e.g., interpreting HTML formatted data for display), corresponding to (b), 
above. 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch in view of Lai, Goldhor, and Scott by 
specifically providing the ability to receive and decode data from the Internet, as taught 
by Scott, since information from the Internet is widely viewed as being useful. 

Regarding claim 4, Baruch in view of Lai, Goldhor, and Scott teach everything 
claimed, as applied above (see claim 2); but Baruch in view of Lai, Goldhor, and Scott 
do not specifically teach that "prior to accessing the content server, executing at least 
one operation based on the at least one recognized audio command." However, the 
examiner contends that this concept was well known in the art, as taught by Scott. 

Scott further teaches that a user can tell the speech server to "show me the stock 
quote" to initiate access to a web page (col. 3, lns. 5-10).

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch in view of Lai, Goldhor, and Scott by 
specifically providing content server access, as taught by Scott, for the purpose of
obtaining information from the Internet.

Regarding claim 5, Baruch in view of Lai, Goldhor, and Scott teach everything 
claimed, as applied above (see claim 2); in addition, Baruch teaches that the voice 
controlled apparatus can give user feedback (¶7), which corresponds to "verifying the at
least one recognized audio command." 

Regarding claim 6, Baruch in view of Lai, Goldhor, and Scott teach everything 
claimed, as applied above (see claim 23); in addition, Baruch teaches that if a voice 
input is not recognized, the system may provide a visual and/or audible message (¶40),
which corresponds to "generating an error notification." But Baruch in view of Lai, 
Goldhor, and Scott do not specifically teach that this would occur "when the at least one 
first confidence value and the at least one second confidence value are below a 
minimum confidence level." However, it is necessary in a system such as Baruch in
view of Lai, Goldhor, and Scott's where a recognition decision is made based on 
confidence levels that if the results of both recognition units are below their respective 
minimum confidence levels, an error would result. 
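
For illustration only, the limitation discussed for claim 6 can be sketched as follows. This is a minimal Python sketch under stated assumptions and is not drawn from Baruch, Lai, Goldhor, or Scott; the names `Recognition` and `select_or_error`, the single shared threshold, and the default value 0.5 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Recognition:
    command: str       # recognized audio command (hypothetical field name)
    confidence: float  # confidence value reported by the engine, assumed in [0.0, 1.0]


def select_or_error(first: Recognition, second: Recognition,
                    min_confidence: float = 0.5) -> Optional[Recognition]:
    """Return the higher-confidence result, or None when both are below the minimum."""
    if first.confidence < min_confidence and second.confidence < min_confidence:
        # Neither engine is confident enough: the caller would generate
        # the error notification instead of acting on a command.
        return None
    return first if first.confidence >= second.confidence else second
```

In this sketch a return value of `None` stands in for the "error notification" condition; how the notification is presented (visual or audible) is left to the caller.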

Regarding claim 25, Baruch discloses a voice control system with multiple 
speech recognition engines. Baruch's system includes the ability to input a command 
from a microphone to a recognition engine (abstract, ¶3, ¶10), which corresponds to "a
first speech recognition means, operably coupled to an audio subsystem, for receiving 
an audio command and generating at least one first recognized audio command." In 
addition, Baruch suggests the use of confidence values (¶39), but Baruch does not
specifically indicate "the at least one first recognized audio command has a first
confidence value." However, the examiner contends that this concept was well known in 
the art, as taught by Lai. 

Lai discloses a speech recognition confidence level display that produces a score 
(confidence level) for each word that is recognized (col. 2, lines 61-63). 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch by specifically providing a confidence 
level with each recognition result, as taught by Lai, for the purpose of determining the 
degree of confidence associated with a given recognition event. 

Baruch's system includes the ability to input a command from a microphone to a 
second recognition engine (abstract, ¶3, ¶10, Fig. 1), which corresponds to "a second
speech recognition means, operably coupled to the audio subsystem, for receiving the 
audio command and generating, independent of the first speech recognition means, at 
least one second recognized audio command." In addition, Baruch suggests the use of 
confidence values (¶39) but does not specifically indicate "each of the at least one
second recognized audio command has a second confidence value." However, the 
examiner contends that this concept was well known in the art, as taught by Lai. 

Lai further teaches the production of a score (confidence level) for each word that 
is recognized (col. 2, lines 61-63). 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch by specifically providing a confidence 
level with each recognition result, as taught by Lai, for the purpose of determining the
degree of confidence associated with a given recognition event. 

Baruch's system has a control unit 18 connected directly to the recognition 
engines (¶23) and the ability to choose between the recognition results from the first
and second recognizers (¶9, Fig. 1), which corresponds to "a means, operably coupled
to the first speech recognition means and the second speech recognition means, for 
receiving the at least one first recognized audio command and the at least one second 
recognized audio command." In addition, Baruch teaches that the recognition engines
are connected to a control unit 18 which is connected to the engine association unit 20
and is also connected to the digital communication unit 30 (Fig. 1, 18), which
corresponds to "a dialog manager operably coupled to the first speech recognition
means and the second speech recognition means." But Baruch does not specifically
teach that the dialog manager is "operably coupleable to an external content server". 
However, the examiner contends that this concept was well known in the art, as taught 
by Scott. 

In the same field of endeavor, Scott teaches the use of a speech/web browser 7 
that is TCP linked 13 to the Internet 2 that can be used to access information (col. 3, 
lns. 3-10).

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch by specifically providing the 
techniques, as taught by Scott, to allow convenient access to information on the 
Internet.

Furthermore, Baruch teaches that the recognition results go to the control unit 18
and the result of the speech recognition is used in command and control applications
such as the retrieval of messages or data from a PDA (¶3, ¶49), which corresponds to
"the dialog manager determines a dialog manager audio command from the at least one 
recognized command confidence levels". 

But Baruch does not specifically teach "inserting the dialog manager audio 
command within a form". However, the examiner contends that this concept was well 
known in the art, as taught by Goldhor. 

In the same field of endeavor, Goldhor teaches a report generation method 
where speech recognition can be used to insert text into a report form (abstract, Fig. 2, 
col. 1, lines 26-35, lines 60-65).

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch by specifically providing the ability to 
insert words into a form using speech recognition, as taught by Goldhor, since this 
approach allows the recognition system to adjust to what is expected in the current entry
field (col. 1, lines 30-35). 

Furthermore, Baruch does not specifically teach "such that the dialog manager 
access the external the content server in response to the dialog manager audio 
command to retrieve encoded information therefrom." However, the examiner contends 
that this concept was well known in the art, as taught by Scott.

Scott further discloses a method for accessing the Internet using speech
recognition where, for example, a user can get a stock quote over the Internet using a
recognizer (Fig. 1, col. 3, lns. 5-10).

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch by specifically providing access to 
the Internet, as taught by Scott, to increase access to information. 

Regarding claim 13, Baruch in view of Lai, Goldhor, and Scott teach everything 
claimed, as applied above (see claim 25). In addition, Baruch teaches the choosing 
between a first recognized result of the first engine and a second recognized result of 
the second engine where the recognition units are coupled to a control unit (¶9, Fig. 1,
18), which corresponds to "a dialog manager operably coupled to the means for 
receiving, wherein the means for receiving selects at least one recognized audio 
command having a recognized confidence value from the at least one first recognized 
audio command and the at least one second recognized audio command based on the 
at least one first confidence value and the at least one second confidence value." 

Regarding claim 15, Baruch in view of Lai, Goldhor, and Scott teach everything 
claimed, as applied above (see claim 25). In addition, Baruch teaches that through 
voice commands a user can access a list of previously selected languages or email 
messages (¶44, ¶49), but neither Baruch nor Baruch in view of Lai, Goldhor, and Scott
specifically teach, "wherein the dialog manager retrieves encoded information in 
response to the dialog manager audio command." However, the examiner contends 
that this concept was well known in the art, as taught by Scott. 



Application/Control Number: 10/034,542 Page 11

Art Unit: 2654 

Scott further teaches the use of a speech/web browser 7 that is TCP linked 13 to 
the Internet 2 that can be used to access information (col. 3, lns. 3-10).

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch in view of Lai, Goldhor, and Scott by 
specifically providing the techniques, as taught by Scott, to allow convenient access to 
information on the Internet. 

Regarding claim 16, Baruch in view of Lai, Goldhor, and Scott teach everything 
claimed, as applied above (see claim 15). In addition, Baruch teaches that a list of 
requested languages may be provided by loudspeaker (¶44), which corresponds to "a
speech synthesis engine operably coupled to the dialog manager, wherein the speech 
synthesis engine receives speech encoded information from the dialog manager and 
generates speech formatted information." 

Regarding claim 17, Baruch in view of Lai, Goldhor, and Scott teach everything 
claimed, as applied above (see claim 16). In addition, Baruch teaches that a speaker 34 
is attached to a digital communication unit 30 and a control unit 18, and that this 
subsystem can generate audio prompts (¶41), which corresponds to "the audio
subsystem is operably coupled to the speech synthesis engine, wherein the audio 
subsystem receives the speech formatted information and provides an output 
message." 

Regarding claim 18, Baruch in view of Lai, Goldhor, and Scott teach everything 
claimed, as applied above (see claim 17). In addition, Baruch teaches that if the input is 
not recognized an audible message may be given (¶41), which corresponds to "the
means for receiving provides the dialog manager with an error notification, the output
message is an error statement." 



2. Claims 3, 8-11, 21, 22, 24 and 26 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Baruch in view of Lai, Scott, and Goldhor and further in view of Baker 
(U.S. Patent No. 6,122,613). 

Regarding claim 3, Baruch in view of Lai, Scott, and Goldhor teach everything 
claimed, as applied above (see claim 2), but Baruch in view of Lai, Goldhor, and Scott 
do not specifically teach "weighting the at least one first confidence value by a first 
weight factor and weighting the at least one second confidence values by a second 
weight factor." However, the examiner contends that this concept was well known in the 
art, as taught by Baker. 

Baker discloses a voice control system with multiple voice recognition engines 
in which the recognition results are combined based on weighting factors (col. 3,
lines 38-42).

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch in view of Lai, Goldhor, and Scott by 
specifically weighing the results from each recognizer, as taught by Baker, for the 
purpose of assigning a greater weight to the recognizer known to be more accurate 
(Baker, col. 3, line 42). 
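
For illustration only, the weighting limitation discussed for claim 3 can be sketched as follows. This is a minimal Python sketch under stated assumptions and does not reproduce Baker's method; the function name `weighted_select`, the (command, confidence) pair layout, and the weight values are hypothetical.

```python
def weighted_select(first, second, first_weight=1.0, second_weight=1.0):
    """Pick between two (command, confidence) pairs after weighting each confidence.

    A larger weight favors the recognizer assumed to be more accurate.
    """
    first_score = first[1] * first_weight     # weighted first confidence value
    second_score = second[1] * second_weight  # weighted second confidence value
    return first[0] if first_score >= second_score else second[0]
```

With equal weights the higher raw confidence wins; lowering the second weight can reverse the outcome when the second engine is trusted less.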

Regarding claim 24, Baruch discloses a voice control system with multiple 
speech recognition engines 10. Baruch's system includes the ability to input an audio 
command to two recognition engines (abstract, ¶3, ¶10), which corresponds to
"providing an audio command to a terminal speech recognition engine and at least one 
. . . [additional] speech recognition engine; recognizing the audio command within the 
terminal speech recognition engine to generate at least one terminal recognized audio 
command." Baruch suggests the use of confidence levels (¶39) but does not
specifically teach "wherein the at least one terminal [or network] recognized audio 
command has a corresponding terminal confidence value." However, the examiner 
contends that this concept was well known in the art, as taught by Lai. 

Lai discloses a speech recognition confidence level display and teaches the 
production of a score (confidence level) for each word that is recognized (col. 2, lines 
61-63). 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch by specifically providing a confidence 
level with each recognition result, as taught by Lai, for the purpose of determining the 
degree of confidence associated with a given recognition event. 

In addition, Baruch does not specifically teach that the second recognizer is a 
network speech recognition engine. However, the examiner contends that the concept 
of the use of a second recognizer connected on a network was well known in the art, as 
taught by Baker. 

Baker teaches speech recognition using two recognizers applied to the same
input sample, where the second recognizer can be a network device (Fig. 3, Fig. 5).

Therefore, it would have been obvious to one having ordinary skill in the art at
the time the invention was made to modify Baruch by specifically providing an additional 
recognizer accessed by a network connection, as taught by Baker, for the purpose of 
providing access to a more powerful recognizer through a network connection. 

In addition, Baruch does not specifically teach the step of "recognizing the audio 
command within the at least one network speech recognition engine to generate at least 
one network recognized audio command, wherein the at least one network recognized 
audio command has a corresponding network confidence value." However, the 
examiner contends that this concept was well known in the art, as taught by Baker. 

Baker further teaches that the output of the network recognizer is assigned a 
score (or confidence level) (abstract, Fig. 5). 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch in view of Lai and Scott, as taught by 
Baker, to aid in the decision process when selecting from between the recognition 
candidates. 

In addition, Baruch in view of Lai, Scott and Baker teach: the choosing between a 
first recognized result of the first engine and a second recognized result of the second 
engine (Baruch, ¶9) where when confidence values are used, as taught above by Lai
and Baker, these values would be used in the recognition selection process, which 
corresponds to "selecting at least one recognized audio command having a recognized 
audio command confidence value from the at least one terminal recognized audio 
command and the at least one network recognized audio command."

But Baruch does not specifically teach "inserting the at least one recognized
audio command with a form." However, the examiner contends that this concept was 
well known in the art, as taught by Goldhor. 

In the same field of endeavor, Goldhor teaches a report generation method 
where speech recognition can be used to insert text into a report form (abstract, Fig. 2, 
col. 1, lines 26-35, lines 60-65).

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch by specifically providing the ability to 
insert words into a form using speech recognition, as taught by Goldhor, since this 
approach allows the recognition system to adjust to what is expected in the current entry
field (col. 1, lines 30-35). 

In addition, Baruch teaches using the recognizer to choose and transfer a 
language contained in a database or information from a device (PDA) over the 
communications links (Baruch, ¶44, penultimate sentence, ¶47, Fig. 1, 30, ¶5), but
neither Baruch nor Baruch in view of Lai, Scott, Goldhor, and Baker teach, "accessing 
an external content server in response to the at least one recognized audio command to 
retrieve encoded information therefrom." However, the examiner contends that this 
concept was well known in the art, as taught by Scott. 

Scott further discloses a method for accessing the Internet using a speech 
recognizer to get a stock quote over the Internet (Fig. 1, col. 3, lns. 5-10).

Therefore, it would have been obvious to one having ordinary skill in the art at
the time the invention was made to modify Baruch by specifically providing access to 
the Internet, as taught by Scott, to increase access to information. 

Regarding claim 8, Baruch in view of Lai, Scott, Goldhor, and Baker teach 
everything claimed, as applied above (see claim 24); in addition, Baruch teaches that if 
a voice input is not recognized, the system may provide a visual and/or audible 
message (¶40), which corresponds to "prior to accessing a content server, generating
an error notification." But Baruch in view of Lai, Scott, Goldhor, and Baker do not 
specifically teach that this would occur "when the at least one terminal confidence value 
and the at least one network confidence value are below a minimum confidence level." 
However, it is necessary in a system such as Baruch in view of Lai, Scott and Goldhor's 
when a recognition decision is made based on confidence levels that if the results of 
both recognition units are below the respective minimum confidence levels, an error 
would result. 

Regarding claim 9, Baruch in view of Lai, Scott, Goldhor, and Baker teach 
everything claimed, as applied above (see claim 24), but Baruch in view of Lai, Scott, 
Goldhor, and Baker do not specifically teach "weighting the at least one terminal 
confidence value by a terminal weight factor and the at least one network confidence 
value by a network weight factor." However, the examiner contends that this concept 
was well known in the art, as taught by Baker. 

Baker further teaches the combining of the recognition results based on the 
weighting factors (col. 3, lines 38-42).

Therefore, it would have been obvious to one having ordinary skill in the art at
the time the invention was made to modify Baruch in view of Lai, Scott and Baker by 
specifically weighing the results from each recognizer, as taught by Baker, for the 
purpose of assigning a greater weight to the recognizer known to be more accurate 
(Baker, col. 3, line 42). 

Regarding claim 10, Baruch in view of Lai, Scott, Goldhor, and Baker teach 
everything claimed, as applied above (see claim 24) including the assignment of a 
confidence level to the recognition events (Lai, col. 2, lines 61-63; Baker, abstract), but 
Baruch in view of Lai, Scott, Goldhor, and Baker do not specifically teach "filtering the at 
least one recognized audio command based on the at least one recognized audio 
command confidence value." However, the examiner contends that this concept was 
well known in the art, as taught by Lai. 

Lai further discloses the ability to select score thresholds above or below which 
recognized words are displayed (col. 3, lines 36-40).

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch in view of Lai, Scott, Goldhor, and 
Baker by specifically supporting the filter capability, as taught by Lai, for the purpose of 
determining what the minimum confidence level for recognition will be. 
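
For illustration only, the filtering limitation discussed for claim 10 can be sketched as follows. This is a minimal Python sketch under stated assumptions and does not reproduce Lai's display thresholds; the function name `filter_by_confidence` and the (command, confidence) pair layout are hypothetical.

```python
def filter_by_confidence(results, min_confidence):
    """Keep only the (command, confidence) pairs at or above the minimum confidence."""
    return [(command, confidence)
            for command, confidence in results
            if confidence >= min_confidence]
```

The surviving candidates could then be passed to whatever selection step picks the highest-confidence command.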

In addition, Baruch teaches the choosing of a command based on the results 
from the recognizers (¶9) where the choice would necessarily be the command with the
highest confidence, and in a control system such as Baruch's (abstract, ¶19) the
recognized command would necessarily be executed, which corresponds to "executing
an operation based on the recognized audio command having the highest recognized
audio command confidence value." 

Regarding claim 11, Baruch in view of Lai, Scott, Goldhor, and Baker teach 
everything claimed, as applied above (see claim 24); in addition Baruch teaches the 
ability of the system to get confirmation from the user (¶50), which corresponds to
"verifying the at least one recognized audio command to generate a verified recognized
audio command"; and in a control system such as Baruch's (abstract, ¶19) the
execution of the command would necessarily follow the affirmation, which corresponds 
to "executing an operation based on the verified recognized audio command." 

Regarding claim 26, Baruch discloses a voice control system with multiple 
speech recognition engines 10. Baruch's system includes the ability to input an audio 
command into a microphone 12 connected to a recognition engine (abstract, ¶3, ¶10),
which corresponds to "a terminal speech recognition engine operably coupled to a 
microphone and coupled to receive an audio command and generate at least one 
terminal recognized audio command." But Baruch does not specifically teach "wherein 
the at least one terminal recognized audio command has a corresponding terminal 
confidence value." However, the examiner contends that this concept was well known 
in the art, as taught by Lai. 

Lai discloses a speech recognition confidence level display, which indicates the 
production of a score (confidence level) for each word that is recognized (col. 2, lines 
61-63).

Therefore, it would have been obvious to one having ordinary skill in the art at
the time the invention was made to modify Baruch by specifically providing a confidence 
level with each recognition result, as taught by Lai, for the purpose of determining the 
degree of confidence associated with a given recognition event. 

In addition, Baruch does not specifically disclose "at least one network speech 
recognition engine operably coupled to the microphone and coupled to receive the 
audio command and generate at least one network recognized audio command, 
independent of the terminal speech recognition engine, wherein the at least one 
network recognized audio command has a corresponding network confidence value." 
However, the examiner contends that the concept of the use of a second recognizer 
connected on a network was well known in the art, as taught by Baker. 

Baker teaches speech recognition using multiple recognizers applied to the
same input sample, where the second recognizer can be a network device and a
confidence value is associated with recognition candidates (abstract, Fig. 3, 315, 309).

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch by specifically providing an additional 
recognizer accessed by a network connection, as taught by Baker, for the purpose of 
providing access to a more powerful recognizer through a network connection. 

In addition, Baruch teaches the connecting of the recognition engines to a control
unit 18 where the results are sent, which corresponds to "a comparator operably
coupled to the terminal speech recognition engine operably coupled to receive the at
least one terminal recognized audio command and further operably coupled to the at
least one network speech recognition engine operably coupled to receive the at least
one network recognized audio command." 

In addition, Baruch in view of Lai, Scott, Goldhor and Baker disclose the 
choosing between a first recognized result of the first engine and a second recognized 
result of the second engine (Baruch, ¶9) and where confidence values are used in the
recognition selection process, as taught above by Lai and Baker, which corresponds to 
"a dialog manager operably coupled to the comparator, wherein the comparator selects 
at least one recognized audio command having a recognized confidence value from the 
at least one terminal recognized audio command and the at least one network 
recognized audio command based on the at least one terminal confidence value and 
the at least one network confidence value." 

Baruch also teaches: the system may require confirmation before proceeding
(¶50), which corresponds to "the selected at least one recognized audio command is
provided to the dialog manager"; a choice is made in the control unit (dialog manager) 
between the recognition results of two recognizers where a decision rule might be 
applied based on confidence level (abstract, 119, fl39, 1J50), which corresponds to "a 
dialog manager audio command determined by the dialog manager from the at least 
one recognized audio commands based on the at least one recognized audio command 
confidence levels such that the dialog manager." But Baruch does not specifically teach 
"inserts the dialog manager command within a form." However, the examiner contends 
that this concept was well known in the art, as taught by Goldhor. 

In the same field of endeavor, Goldhor teaches a report generation method 
where speech recognition can be used to insert text into a report form (abstract, Fig. 2, 
col. 1, lines 26-35, lines 60-65). 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch by specifically providing the ability to 
insert words into a form using speech recognition, as taught by Goldhor, since this 
approach allows the recognition system to adjust to what is expected in the current entry 
field (col. 1, lines 30-35). 

Furthermore, in Baruch's system the control unit 18 is connected to both 
the engine association unit 20 and the digital communication unit 30, both of which can 
access databases (¶47-49), but neither Baruch nor Baruch in view of Lai, Scott, 
Goldhor, and Baker specifically teach, "the dialog manager being operably coupleable 
an external to a content server such that the operation executed by the dialog manager 
includes accessing the external content server to retrieve encoded information 
therefrom." However, the examiner contends that this concept was well known in the 
art, as taught by Scott. 

In the same field of endeavor, Scott discloses a method for accessing the 
Internet using speech recognition where, for example, a user can get a stock quote over 
the Internet using a recognizer (Fig. 1, col. 3, lns. 5-10). 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch by specifically providing access to 
the Internet, as taught by Scott, to increase access to more information. 

Regarding claim 21, Baruch in view of Lai, Scott, Baker and Goldhor teach 
everything claimed, as applied above (see claim 26). In addition, Baruch discloses a 
technique where a user may call up a list of languages or email based on a command 
(¶44, ¶48-49), but neither Baruch nor Baruch in view of Lai, Scott, Baker and Goldhor 
specifically teach "wherein the dialog manager retrieves the encoded information from 
the content server in response to the dialog manager audio command." However, the 
examiner contends that this concept was well known in the art, as taught by Scott. 

Scott further discloses receiving information over the Internet in response to a 
spoken command (col. 3, lns. 5-10) (normally encoded as HTML formatted data for 
display). 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Baruch in view of Lai, Scott, Baker, and 
Goldhor by specifically providing the ability to receive and decode data from the 
Internet, as taught by Scott, since information from the Internet is widely viewed as 
being useful. 

Regarding claim 22, Baruch in view of Lai, Scott, Baker and Goldhor teach 
everything claimed, as applied above (see claim 21). In addition, Baruch discloses a 
loudspeaker for audible output messages that is connected to the control unit through 
the digital communications unit (Fig. 1, ¶41), which corresponds to "wherein the speech 
synthesis engine receives speech encoded information from the dialog manager and 
generates speech formatted information; and a speaker operably coupled to the speech 
synthesis engine, wherein the speaker receives the speech formatted information and 
provides an output message." 



Conclusion 

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the mailing date of this final action. 

Any response to this office action should be mailed to: 

Commissioner of Patents and Trademarks 
P.O. Box 1450 
Alexandria, VA 22313-1450 

or faxed to: 

(703) 872-9314 

Hand-delivered responses should be brought to: 

Crystal Park II 
2121 Crystal Drive 
Arlington, VA. 
Sixth Floor (Receptionist) 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Dr. V. Paul Harper whose telephone number is (703) 
305-4197. The examiner can normally be reached on Monday through Friday from 8:00 
a.m. to 4:30 p.m. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached on (703) 305-9645. The fax phone 
number for the Technology Center 2600 is (703) 872-9314. 

Any inquiry of a general nature or relating to the status of this application or 
proceeding should be directed to the Technology Center 2600 Customer Service office 
whose telephone number is (703) 306-0377. 



VPH/vph 
October 10, 2003 